
(NASA-CR-179822) NUMERICAL ALGORITHMS FOR
FINITE ELEMENT COMPUTATIONS ON CONCURRENT
PROCESSORS Semiannual Report, 1 Mar. - 31
Aug. 1986 (Virginia Univ.) 5 p CSCL 12A

N87-11542
Unclas
G3/64 43789


Report No. UVA/528190/AM87/112
September 1986 






SCHOOL OF ENGINEERING AND 
APPLIED SCIENCE 


DEPARTMENT OF APPLIED MATHEMATICS 

UNIVERSITY OF VIRGINIA 
CHARLOTTESVILLE, VIRGINIA 22901 



A Semi-Annual Report 

For the period March 1, 1986 - August 31, 1986 
Grant No. NAG-1-46

NUMERICAL ALGORITHMS FOR FINITE ELEMENT COMPUTATIONS 
ON CONCURRENT PROCESSORS 

Submitted to: 

National Aeronautics and 
Space Administration 
Langley Research Center 
Hampton, VA 23665 

Attention: Dr. Olaf Storaasli 

SDD M/S 246 

Submitted by: 

J. M. Ortega 
Professor and Chairman 


Department of Applied Mathematics 
SCHOOL OF ENGINEERING AND APPLIED SCIENCE 
UNIVERSITY OF VIRGINIA 
CHARLOTTESVILLE, VIRGINIA 


Report No. UVA/528190/AM87/112


Copy No. 


September 1986 


This report summarizes work performed under NASA Grant NAG-1-46 during
the period March 1, 1986 to August 31, 1986.

Prof. Ortega has been supervising the research of several graduate students who
have been at least partly supported by this grant or by NASA Grant NAG-1-142.
The work of all of them relates to the general goals of this project.

Charles Romine completed his Ph.D. degree in August. He has given a detailed
analysis of the so-called ijk forms of Gaussian elimination and Choleski
factorization on concurrent processors. His work was motivated by the Finite
Element Machine at NASA-Langley but pertains more generally to any
message-passing machine with a global bus. The main results are the optimality
of a particular form of Gaussian elimination as well as a detailed complexity
analysis of this form. Similar, but not as complete, results are given for
Choleski factorization. In addition, results are obtained for a column-oriented
triangular equation solver which shows a much higher degree of parallelism than
had been assumed possible; this result has been accepted for publication in
Parallel Computing in a joint paper by Romine and Prof. Ortega. The analysis of
the ijk forms will be the subject of another joint paper following a prior one
by Prof. Ortega on the ijk forms for vector computers.
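
For orientation, the following is a minimal serial sketch of a generic
column-oriented (column-sweep) solve of a lower triangular system Lx = b. It is
not the parallel algorithm analyzed by Romine and Ortega; the function name and
the dense list-of-rows storage are assumptions made only for illustration. Once
x[j] is known, the updates of the remaining right-hand side entries by column j
are mutually independent, which is where such a solver can expose parallelism.

```python
def column_sweep_lower_solve(L, b):
    """Solve L x = b for a nonsingular lower triangular L (list of rows).

    Column-oriented form: as soon as x[j] is computed, column j of L is
    used to update all remaining entries of the right-hand side.
    """
    n = len(b)
    x = [float(v) for v in b]  # work on a copy of the right-hand side
    for j in range(n):
        x[j] = x[j] / L[j][j]
        # These updates are independent of one another for fixed j.
        for i in range(j + 1, n):
            x[i] -= L[i][j] * x[j]
    return x


# Small usage example (hypothetical data).
L = [[2, 0, 0],
     [1, 3, 0],
     [4, 5, 6]]
b = [2, 7, 32]
print(column_sweep_lower_solve(L, b))  # [1.0, 2.0, 3.0]
```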

Eugene Poole completed his Ph.D. in May 1986, with a thesis on the
vectorization of the Incomplete Choleski Conjugate Gradient method on the
Cyber 205. The implementations use multicoloring orderings and matrix
multiplication by diagonals and give a very high degree of vectorization. The
thesis has been published as a NASA contractor report, and a paper by Poole and
Ortega has been submitted for journal publication.
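
As an illustration of matrix multiplication by diagonals, the sketch below forms
y = Ax when A is stored as a set of diagonals rather than rows. The storage
layout (a dictionary keyed by diagonal offset) and the function name are
assumptions for this example and are not taken from Poole's thesis. Each
diagonal contributes one long elementwise multiply-add, which is the kind of
operation a vector machine such as the Cyber 205 handles efficiently.

```python
def matvec_by_diagonals(diagonals, x):
    """Compute y = A x with A stored by diagonals.

    diagonals maps an offset k to a list d of entries on that diagonal:
      k >= 0: d[i] = A[i][i + k]   (main diagonal and superdiagonals)
      k <  0: d[i] = A[i - k][i]   (subdiagonals)
    """
    n = len(x)
    y = [0.0] * n
    for k, d in diagonals.items():
        if k >= 0:
            # One vector multiply-add of length n - k.
            for i in range(n - k):
                y[i] += d[i] * x[i + k]
        else:
            m = -k
            # One vector multiply-add of length n - m.
            for i in range(n - m):
                y[i + m] += d[i] * x[i]
    return y


# Usage: the 4 x 4 tridiagonal matrix with stencil (-1, 2, -1).
diags = {0: [2, 2, 2, 2], 1: [-1, -1, -1], -1: [-1, -1, -1]}
y = matvec_by_diagonals(diags, [1, 2, 3, 4])
```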



Andrew Cleary and David Harrar spent the summer at NASA-Langley implementing
various versions of Gaussian elimination and Choleski factorization on the
FLEX/32. A global memory version of Gaussian elimination for banded matrices
and a corresponding global memory Choleski code are running and producing
surprisingly large speed-ups on preliminary runs. A local memory Choleski code
using profile storage is also running but has not been tested as extensively.
All of these codes are designed to be used in the environment of NICE/SPAR, and
the plan is to begin running different versions of the blade-stiffened panel
focus problem. A report on progress to date is now in preparation.

Courtenay Vaughan continued his development of SSOR preconditioned conjugate
gradient methods, primarily on an Intel iPSC Hypercube at Oak Ridge National
Laboratory, where he spent the summer, but also on the FLEX/32 at
NASA-Langley. His model problems to date have included a generalized Poisson
equation and a plane stress problem, but he will begin work shortly on the
panel focus problem.
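
The following is a small serial sketch of a conjugate gradient iteration with an
SSOR preconditioner, included only to make the method concrete. It is not
Vaughan's code: the dense storage, the function names, and the default
relaxation parameter omega = 1.2 are all assumptions for illustration. Writing
the symmetric matrix as A = L + D + L^T, the preconditioner used here is
M = (D + omega L) D^{-1} (D + omega L^T) / (omega (2 - omega)).

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ssor_apply(A, r, omega):
    """Return z = M^{-1} r for the SSOR preconditioner M defined above."""
    n = len(r)
    # Forward solve: (D + omega*L) u = r
    u = [0.0] * n
    for i in range(n):
        s = sum(A[i][j] * u[j] for j in range(i))
        u[i] = (r[i] - omega * s) / A[i][i]
    # Backward solve: (D + omega*L^T) z = D u, using A[i][j] = A[j][i]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (A[i][i] * u[i] - omega * s) / A[i][i]
    scale = omega * (2.0 - omega)
    return [scale * zi for zi in z]


def ssor_pcg(A, b, omega=1.2, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive definite A (dense list of rows)."""
    x = [0.0] * len(b)
    r = [float(v) for v in b]
    if math.sqrt(dot(r, r)) <= tol:
        return x
    z = ssor_apply(A, r, omega)
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        Ap = [dot(row, p) for row in A]
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol:
            break
        z = ssor_apply(A, r, omega)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x
```

A one-dimensional Poisson matrix (tridiagonal with stencil -1, 2, -1) is a
convenient test case for this sketch.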

Eugene Poole has begun work on the panel focus problem on NICE/SPAR. His 
immediate goal is to extract stiffness matrices of several sizes to be used by Cleary 
and Vaughan in their direct and iterative codes. He will also use his multicolored 
ICCG method on this focus problem on the FLEX/32, the CRAY-2 and the CRAY
X-MP.